Characterization of new psychoactive substances by integrating benchtop NMR to multi‐technique databases

Abstract New psychoactive substances (NPS) have become a serious threat to public health because they can easily be sold on the street or on the internet. NPS are either commercial drugs that are misused (recreational rather than medical use) or derivatives whose structure has been slightly modified. To regulate NPS, it is essential to accurately characterize them, either to recognize molecules that were previously identified or to quickly elucidate the structure of unknown ones. Most approaches rely on the determination of the exact mass by high‐resolution mass spectrometry, which requires expensive equipment. This motivated us to develop a workflow in which the elucidation is assisted by databases and does not need the exact mass. This workflow combines 1D and 2D NMR measurements performed on a benchtop spectrometer with IR spectroscopy to create a multi‐technique database for characterizing pure and mixed NPS. The experimental database was created with 57 entries mostly coming from seizures, mainly cathinones, cannabinoids, amphetamines, arylcyclohexylamines, and fentanyl. A blind validation of the workflow was carried out on a set of six unknown seizures. In the first three cases, AF, AB‐FUBINACA, and a mixture of 2C‐I and 2C‐E could be straightforwardly identified with the help of their reference spectra in the database. The next two samples were elucidated for the first time with the help of the database, revealing the substances NEK and MPHP. Finally, a precise quantification of each characterized NPS was obtained in order to track NPS trafficking networks.


| INTRODUCTION
New psychoactive substances (NPS) [1] encompass several families of molecules that mimic the effects of conventional illicit products such as ecstasy, cocaine, or cannabis. NPS are either commercial drugs that are misused (recreational rather than medical use) or derivatives whose structure has been slightly modified. As a consequence of such chemical modifications, these hazardous compounds are not always controlled under the International Drug Control Conventions, and their legal status often remains undefined. At the end of 2020, the European Monitoring Centre for Drugs and Drug Addiction (EMCDDA) was monitoring approximately 800 NPS from different families. [2,3] These NPS include synthetic cannabinoids derived from marijuana, analogues of ketamine, which is used as an anesthetic, fentanyloids, related to fentanyl, a painkiller more powerful than morphine, and benzodiazepines, which are used to treat anxiety, as well as cathinones and phenethylamines. The effects of these substances generally remain unknown, and their consumption can lead to intoxication, overdose, or even death. Due to the growing consumption of NPS as well as the difficulty of constantly adapting the regulations of European member states, there is an urgent need for accessible and reliable analytical tools allowing improved tracking of NPS seized by the police and/or purchased on the internet. Indeed, it is essential to accurately characterize NPS, to recognize molecules that were previously identified, and to elucidate the structure of unknown ones. Identification of known substances consists in comparing experimental data of a seized compound with reference data, which is quite straightforward. In contrast, elucidation is a more challenging task: it becomes necessary when identification fails and consists in determining the structure of the unknowns in a sample from a set of combined analytical data.
An efficient NMR-based approach named CASE (computer-assisted spectral elucidation) has been developed for structure elucidation of small molecules. [4] CASE uses software to generate all possible molecular structures that are consistent with a particular set of 1D and 2D NMR data. However, CASE relies on the determination of the exact mass by high-resolution mass spectrometry (HRMS), a technique which is not widely available in forensic science services.
This motivated us to develop a workflow in which the elucidation is assisted with databases and does not require prior knowledge of the exact mass.
NPS characterization can be achieved by a variety of complementary analytical techniques. The first one is gas chromatography coupled to mass spectrometry (GC-MS), currently the most used in forensics. [5] MS combined with chemical ionization (CI) may be used to provide the molecular weight, but relatively expensive HRMS instrumentation is required to access the accurate molecular weight. MS in electron ionization (EI) mode provides information on different fragments of the molecule, and tandem MS/MS (or MSⁿ) delivers further insights into the fragmentation schemes of the molecule. Another important technique is infrared (IR) spectroscopy, which is mainly used to determine chemical functions within a molecule but can also be applied for spectral recognition. [6][7][8] Unfortunately, these methods provide limited structural information. NMR spectroscopy is probably the most powerful structure identification and elucidation tool for small molecules such as NPS. Indeed, it offers highly accurate and specific information on the chemical environment of all atoms through chemical shifts, while providing crucial input on atomic connectivities through J-couplings. Moreover, highly accurate quantification is achievable with NMR with limited sample preparation and a short acquisition time for 1D experiments. However, high-field NMR (¹H frequency > 300 MHz) is rarely used in forensics due to the high purchase, maintenance, and running costs, in addition to the need for dedicated staff and the bulkiness of the NMR instrument. Over the last decade, new compact NMR instruments have emerged, with magnetic fields between 1 and 2.1 T. These instruments can be installed on a benchtop, are easily transportable, and do not require cryogenic fluids. Nevertheless, benchtop NMR remains a young and underexplored technique for NPS characterization as compared with conventional high-field NMR. [9,10] Also, benchtop NMR has to face intrinsic sensitivity and resolution limits, which are illustrated in Figure 1 in the case of two common NPS.
Such limitations make it difficult to rely only on benchtop NMR for structure identification or elucidation. Therefore, combinations with other methods like GC-MS or IR and/or with databases become a prerequisite for a reliable, fast, and automated characterization. Two such multi-method approaches were already explored in a forensic framework. [11] In a first study merging NMR and GC-MS, ¹H NMR spectra of reference compounds were collected using an 80 MHz instrument to create a reference library of 302 spectra of different NPS classes. [12] Next, 432 seized samples were analyzed by NMR and GC-MS for cross-validation. ¹H NMR analysis matched the GC-MS results with a 93% consistency rate. Another study reported the identification of drugs in two case samples, using a limited library of 12 spectra on an 80 MHz benchtop spectrometer, visually compared with data collected on a 600 MHz spectrometer. [13] In agreement with GC-MS analysis, the seized samples were found to contain morphine, acetyl codeine, MAM, and MDMA.

FIGURE 1 Spectra of (a) 370 mM AMB-FUBINACA in DMSO-d6 and (b) 306 mM 2-BMMP in DMSO-d6 acquired at high field (top: 400 MHz) and on a benchtop spectrometer (bottom: 60 MHz). Peak assignment after phasing and baseline correction is indicated. The two spectra are obtained with the same conditions: eight scans, a repetition time (TR) of 30 s, and an acquisition time of 1.6 s at 299.6 K. Most of the signals overlap at 60 MHz, especially in the aromatic area of AMB-FUBINACA. This illustrates the resolution limitation of benchtop NMR. Indeed, since the resolution scales linearly with the magnetic field B₀ in theory, the 400 MHz spectrum is expected to be roughly seven times more resolved than the 60 MHz spectrum. As regards sensitivity, it scales with B₀^(3/2); hence, the 400 MHz spectrum is expected to be approximately 17 times more sensitive than a 60 MHz spectrum recorded in identical conditions.
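The two scaling rules quoted in the caption of Figure 1 (resolution proportional to B₀, sensitivity proportional to B₀^(3/2)) can be checked with a few lines of arithmetic. The function name below is only illustrative, and the sketch ignores every instrument-specific factor (probe, shimming, linewidth) that affects real spectra.

```python
from typing import Tuple


def field_scaling(high_mhz: float, low_mhz: float) -> Tuple[float, float]:
    """Return (resolution gain, sensitivity gain) of the high-field
    instrument over the low-field one, using the theoretical scaling laws."""
    ratio = high_mhz / low_mhz
    resolution_gain = ratio          # resolution scales linearly with B0
    sensitivity_gain = ratio ** 1.5  # sensitivity scales with B0^(3/2)
    return resolution_gain, sensitivity_gain


res_gain, sens_gain = field_scaling(400.0, 60.0)
print(f"resolution gain: {res_gain:.1f}, sensitivity gain: {sens_gain:.1f}")
```

For 400 MHz versus 60 MHz, this gives gains close to the "roughly seven" and "approximately 17" stated in the caption.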
In spite of these encouraging preliminary results, there is no general workflow incorporating benchtop NMR for the identification and elucidation of NPS structures. A first reason is that the above-mentioned examples were limited to 1D ¹H NMR spectroscopy and did not include 2D NMR experiments, which should provide a much higher degree of confidence in terms of structure characterization. Moreover, the reported elucidation procedures rely on the determination of the exact mass as an initial step, but as explained above, HRMS instruments are uncommon in forensic laboratories. It is also important to note that the SWGDRUG group recommends using at least two techniques for unambiguous identification, which motivated us to combine benchtop NMR with another analytical approach. [14] Databases can also be used with other analytical techniques such as IR. For instance, in a study by Jones et al., 221 samples, most of them mixtures, were screened using IR and Raman. [15] The authors compared the spectra with the database and, if a reference matched, subtracted the reference spectrum from the unknown one and made another comparison with the database. Only 41% of samples were unambiguously identified. The other samples were sent to MS and NMR for a full elucidation procedure. With this approach, 33 samples were identified and added to the IR and Raman library.
Regarding NMR quantification, some previous studies were carried out with benchtop NMR. Naqi et al. quantified seized MDMA by using ¹H-qNMR, UHPLC, and UHPLC-MS. [16] Maleic acid and MDMA-d5 were used as internal references for the NMR and UHPLC measurements, respectively. The MDMA concentrations determined by UHPLC and NMR were found to be comparable, with no statistically significant difference as revealed by single-factor ANOVA. Quantitative benchtop NMR was also applied to drugs with similar molecular structures. For example, Hussain et al. were able to quantify the mass of MDMA contained in a tablet. [17] The values obtained on eight tablets were between 209 and 212 mg and are comparable with results given by GC-MS.
In this context, this work aims to develop and evaluate a general workflow based on optimized 1D and 2D NMR measurements performed on a benchtop spectrometer, combined with IR spectroscopy, for creating a multi-technique database allowing the characterization of pure and mixed NPS. Both identification and elucidation are considered in this workflow, which uses accessible analytical techniques while keeping in mind the demands of end-users in terms of reliability, robustness, and experiment time. We first describe the implementation of the workflow and the choice of its key parameters; the workflow was then blind-tested on six real seized samples, and the quantification capabilities of benchtop NMR were also assessed.

| Sample preparation
Fifty-seven samples were used in this study. Five of them were purchased from Lipomed, seven samples were seized by the Finnish police, while all the others came from seizures by the French police.
In addition, six unknown seizures were obtained from the French police to validate the database. DMSO-d6 purchased from Eurisotop with a purity of 99.8% was used as the NMR solvent. TMS from Acros Organics with a purity of 99.9% and TSP from Eurisotop with a purity of 99.8% were used as chemical shift and concentration references, respectively. All seizures were dissolved in DMSO-d6 to obtain a concentration as close as possible to 300 mM (see Table S1), and 10 μL of TMS was added to calibrate ¹H and ¹³C chemical shifts.

| NMR
All the spectra were recorded at 26.5 °C using a 1 H, 19 F, 13

Three NMR pulse sequences were chosen for our study: 1D ¹H, 1D ¹⁹F, and 2D ¹H-¹³C heteronuclear single quantum coherence (HSQC). The HSQC sequence was optimized with the help of a test molecule, 2-FDCK (see Table S1). Three variants of the HSQC sequence were tested, namely, Echo-Antiecho (EA) HSQC with WALTZ decoupling, phase-sensitive (PH) HSQC with WALTZ decoupling, and PH HSQC with multiplicity editing (ME) and WALTZ decoupling. PH HSQC was found to be on average twice as sensitive as EA HSQC (see S2). PH HSQC ME (see S3) was found to be the most informative since it allowed determining the multiplicity n of ¹³C(¹H)ₙ groups, but this pulse sequence was 20% less sensitive than PH HSQC. The most important acquisition parameters are gathered in Table 1.
The NMR processing was performed with the MestReNova™ software (version 14.1). The 1D free induction decays were first multiplied by an exponential apodization function (0.3 Hz line broadening) and zero-filled by a factor of 4, then manually phased and baseline corrected with a Whittaker smoother algorithm. The 2D free induction decays were first multiplied by a cosine apodization function and zero-filled by a factor of 4 in both dimensions, then manually phased and baseline corrected with a Whittaker smoother algorithm. ¹H and ¹³C chemical shifts were referenced to TMS, and ¹⁹F chemical shifts were referenced to the internal lock substance of the spectrometer.
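The first two 1D processing steps described above can be illustrated with a minimal pure-Python sketch. It is not the MestReNova implementation: the function names are hypothetical, the dwell time must be supplied by the user, and "zero-filled by a factor of 4" is assumed here to mean that the padded FID is four times longer than the acquired one.

```python
import math
from typing import List


def apodize_exponential(fid: List[complex], dwell_s: float,
                        lb_hz: float = 0.3) -> List[complex]:
    """Multiply point i of the FID by exp(-pi * LB * t_i), with t_i = i * dwell.

    This weighting adds a Lorentzian line broadening of LB hertz.
    """
    return [pt * math.exp(-math.pi * lb_hz * i * dwell_s)
            for i, pt in enumerate(fid)]


def zero_fill(fid: List[complex], factor: int = 4) -> List[complex]:
    """Append zeros so that the result is `factor` times longer than the input
    (assumed meaning of the zero-filling factor)."""
    return list(fid) + [0j] * (len(fid) * (factor - 1))
```

Phasing and Whittaker baseline correction act on the Fourier-transformed spectrum and are not covered by this sketch.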

| IR
Analyses were carried out with a Bruker Tensor 27 spectrometer in Attenuated Total Reflectance (ATR) mode. Each spectrum was acquired with 32 scans, a 400-4000 cm⁻¹ spectral range, and a 2 cm⁻¹ spectral resolution. The processing of the spectra was achieved with the Opus™ (v8.5) software. CO₂ and H₂O compensation, baseline correction, and smoothing were applied to all IR spectra to obtain more reliable NPS IR fingerprints.

| ACD/Labs software
All analytical data were gathered in the ACD/Labs software (version 2020.1.0 of July 15, 2020) from Advanced Chemistry Development, Inc (ACD/Labs). First, a project with all NMR experiments and the dedicated structure was created for each molecule and added to the database after peak picking and assignment of all signals and correlations. Then the corresponding IR spectrum was added to the database after an automatic peak picking.

| Database parameters
Once the database has been created, two search methods are available in the ACD/Labs software to compare an unknown spectrum with one from the database. The first one, denoted "similarity search", compares only the shape of the spectra, disregarding peak picking and the multiplicity of the signals. The second one, called "peak searching", compares the peak picking of the NMR signals, an approach which is significantly operator-dependent. With both search methods, the comparison between the experimental spectrum and the database yields a result called the Hit Quality Index (HQI), which quantifies the agreement between the spectrum of the unknown compound and the most closely matching database spectrum.
In the "similarity search" method, HQI can be calculated with two algorithms detailed below. In both cases, the query spectrum is first indexed and split into N small regions. Next, all regions are integrated and, depending on the calculated integral, an index in the [−127; +127] range is assigned to each region.
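The indexing step described above can be sketched as follows. How ACD/Labs converts an integral into an index is not specified in the text, so the linear scaling by the largest absolute integral used here is an assumption, as are the function name and the equal-width regions.

```python
from typing import List


def index_spectrum(intensities: List[float], n_regions: int) -> List[int]:
    """Split a spectrum into n_regions contiguous regions, integrate each one,
    and map the integrals onto integer indexes in the [-127, +127] range."""
    n_points = len(intensities)
    if not 0 < n_regions <= n_points:
        raise ValueError("n_regions must be between 1 and the number of points")
    integrals = []
    for k in range(n_regions):
        start = k * n_points // n_regions
        stop = (k + 1) * n_points // n_regions
        integrals.append(sum(intensities[start:stop]))
    largest = max(abs(v) for v in integrals)
    if largest == 0:
        return [0] * n_regions
    # Assumed mapping: the largest absolute integral is assigned |index| = 127.
    return [round(127 * v / largest) for v in integrals]
```

With this choice the indexes do not depend on the absolute intensity scale of the spectrum, which is convenient when samples differ in concentration.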
The first algorithm relies on the calculation of the Euclidean distance and treats each spectrum as a vector in an N-dimensional space.
Thus, the resulting HQI is a measurement of the angle between the vectors, giving a value between 0 (best match) and the square root of 2 (worst match). In the ACD/Labs software, this value is next scaled to deliver a value between 0 (worst match) and 100 (best match) for the HQI display. In the corresponding formula, L denotes the compared regions, N the number of indexes used, and p_i the index from the experimental spectrum (EXP) or the database spectrum (DB).
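As an illustration of the Euclidean variant, the sketch below normalizes both index vectors to unit length, so that their distance lies between 0 and √2 for non-negative indexes, and then rescales it to 0-100. The linear rescaling and the clamping at 0 are assumptions; the exact ACD/Labs expression is not reproduced here, and the function name is hypothetical.

```python
import math
from typing import Sequence


def hqi_euclidean(exp_idx: Sequence[float], db_idx: Sequence[float]) -> float:
    """Euclidean-distance HQI between two indexed spectra (100 = best match)."""
    if len(exp_idx) != len(db_idx):
        raise ValueError("both spectra must have the same number of regions")
    norm_exp = math.sqrt(sum(p * p for p in exp_idx))
    norm_db = math.sqrt(sum(p * p for p in db_idx))
    if norm_exp == 0 or norm_db == 0:
        raise ValueError("an all-zero index vector cannot be normalized")
    distance = math.sqrt(sum((e / norm_exp - d / norm_db) ** 2
                             for e, d in zip(exp_idx, db_idx)))
    # Assumed linear rescaling: distance 0 -> 100, distance sqrt(2) -> 0.
    return max(0.0, 100.0 * (1.0 - distance / math.sqrt(2)))
```

Because of the normalization, two spectra that differ only by an overall intensity factor receive the maximum score.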
The second algorithm relies on the calculation of the absolute distance based on the difference for each data point with respect to the total area under the curve (i.e., the sum of all the intensities). In other words, each data point represents a percentage of the total area.
Thus, the larger the intensity value of a data point, the larger its weight.

Only one HQI algorithm is available for "peak searching". In its formula, L is the total number of peak regions in the experimental spectrum, N_k the number of peaks in the k-th peak region of the experimental spectrum, and M_k the number of peaks in the k-th peak region of the DB spectrum.
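The absolute-distance variant of the similarity search can be sketched in the same spirit: each data point is expressed as a fraction of the total area, and the fractions are compared point by point. The rescaling to 0-100 is again an assumption rather than the ACD/Labs formula, and the peak-searching HQI is deliberately left out because its expression is not available in the text.

```python
from typing import Sequence


def hqi_absolute(exp_pts: Sequence[float], db_pts: Sequence[float]) -> float:
    """Absolute-distance HQI between two spectra with non-negative intensities."""
    if len(exp_pts) != len(db_pts):
        raise ValueError("both spectra must have the same number of points")
    area_exp = sum(exp_pts)
    area_db = sum(db_pts)
    if area_exp <= 0 or area_db <= 0:
        raise ValueError("each spectrum needs a positive total area")
    # Each point is weighted by its share of the total area of its spectrum.
    distance = sum(abs(e / area_exp - d / area_db)
                   for e, d in zip(exp_pts, db_pts))
    # Assumed linear rescaling: distance 0 -> 100, distance 2 -> 0.
    return max(0.0, 100.0 * (1.0 - distance / 2.0))
```

Intense points dominate the score, which matches the remark that a larger intensity carries a larger weight.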
Finally, to improve the database search, two parameters can be chosen by the operator. The first one is the HQI threshold value under which the results are filtered out. The second one, only available for "peak searching", is the looseness factor (LF), which accounts for the variability in the chemical shifts of the different peaks obtained in NMR. LF corresponds to the largest difference in ppm that can be accepted to provide a match between the experimental spectra and the database (both in 1D and 2D).

Note: NS is the number of scans, DW is the dwell time, AQ is the acquisition time, TR is the repetition time (including the acquisition time and the recovery delay), and NI is the number of increments in the indirect dimension of 2D HSQC spectra. T_exp indicates the experiment time.
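The roles of the looseness factor and of the HQI threshold can be illustrated with a small sketch for HSQC peak lists. All names are hypothetical, and the matched fraction computed here is only a stand-in score for illustration, not the ACD/Labs HQI.

```python
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (1H shift in ppm, 13C shift in ppm)


def within_lf(a: Peak, b: Peak, lf_h: float, lf_c: float) -> bool:
    """Two HSQC peaks match if both shift differences stay within the LF."""
    return abs(a[0] - b[0]) <= lf_h and abs(a[1] - b[1]) <= lf_c


def matched_fraction(query: List[Peak], reference: List[Peak],
                     lf_h: float, lf_c: float) -> float:
    """Fraction of query peaks that have at least one reference peak within LF."""
    if not query:
        return 0.0
    hits = sum(any(within_lf(q, r, lf_h, lf_c) for r in reference)
               for q in query)
    return hits / len(query)


def filter_hits(scores: Dict[str, float], threshold: float) -> Dict[str, float]:
    """Discard database entries whose score falls under the threshold."""
    return {name: s for name, s in scores.items() if s >= threshold}
```

Increasing the LF makes the search more permissive: more database entries match, at the cost of more false positives.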

| Analytical time
The overall data acquisition and analysis times should be kept as short as possible, to remain compatible with the typical duration of police custody (48 to 72 h in France, for instance). In this study, the optimized NMR experiments last 113 min in total for each sample (see Table 1), and IR experiments take less than 1 min per acquisition.
Finally, the data processing steps following acquisition are not very time-consuming (a few minutes per experiment at most). Thus, for identification, the largest share of the overall time originates in the acquisition of analytical data. In contrast, for the elucidation of unknown NPS, the analysis of the spectral data by a trained operator is likely the rate-limiting step, which also depends on the complexity of the problem to solve.

| SWGDRUG recommendations
The choice of creating a database containing IR and NMR analytical data is fully justified by the recommendations of the SWGDRUG group. [14]

This experiment allows separating unknown molecules into two categories, those with ¹⁹F and those without (Figure 2). Second, ¹H-¹³C HSQC was used as it is one of the most sensitive and informative heteronuclear 2D pulse sequences. 2D HSQC spectra were acquired with spectral editing, which makes it possible to label the 13

| Multi-technique workflow
The choice of the analytical workflow is essential given the complexity of samples containing NPS. Any strategy for identifying and elucidating NPS in seized samples has to cope with the difficulty of detecting and quantifying NPS, even in mixtures. Since high-field NMR and HRMS are expensive instruments that most forensic laboratories do not own, there is a need for cheaper and more accessible alternative workflows. We propose optimized 1D and 2D experiments implemented on liquid-state benchtop instruments, in combination with IR and dedicated databases, to identify and also elucidate NPS. Note that our approach aims at being quite general and may not be suited to NPS hidden or sprayed in materials, which would require a more specific strategy.
For our approach, an experimental database was created with 57 entries, mainly cathinones, cannabinoids, amphetamines, arylcyclohexylamines, and fentanyloids (see Table S1). Before elaborating the multi-technique workflow, three model molecules were used to determine the best search method, the optimum algorithm for HQI calculation, the most suitable HQI threshold, as well as the optimum LF value. Table 2 summarizes the optimized parameters; details of the optimization process are provided in the SI (see S4 and S5).
As described above, the starting point of the workflow is the 1D ¹⁹F analysis, which determines whether the molecule contains one or several ¹⁹F nuclei (Figure 2). Then, the measured HSQC spectrum is compared with the database. The result of this comparison dictates the choice between the identification and elucidation parts of the workflow. When the HSQC comparison gives a single match for the unknown structure, the leftmost part of the workflow in Figure 2 is used, corresponding to identification. Nevertheless, according to the SWGDRUG guidelines, one needs to ascertain the structure of the NPS with another technique. This is done by recording the IR spectrum. There is also a small chance that HSQC provides several matches or that not all signals are assigned. In this case, both 1D ¹H NMR and IR can be used to determine the most probable structure.
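The branching logic described in this paragraph can be summarized in a short sketch. The function and its return strings are hypothetical, and the rule that the absence of any HSQC match leads to elucidation is inferred from the case studies reported later rather than stated explicitly here.

```python
from typing import Tuple


def next_step(fluorine_signal: bool, n_hsqc_matches: int,
              all_signals_assigned: bool = True) -> Tuple[str, str]:
    """Return (fluorine category, next step) for an unknown sample."""
    category = "contains 19F" if fluorine_signal else "no 19F"
    if n_hsqc_matches == 0:
        # Assumption: no database match sends the sample to elucidation.
        return category, "elucidation"
    if n_hsqc_matches == 1 and all_signals_assigned:
        return category, "identification: confirm with IR"
    # Several matches, or signals left unassigned.
    return category, "identification: resolve with 1D 1H NMR and IR"
```

In every identification branch a second technique is consulted, in line with the SWGDRUG guideline mentioned above.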
The rightmost side of the workflow displayed in Figure 2

| Database validation on real cases
A blind validation of the workflow was carried out on a set of six unknown seizures (numbered from 1 to 6) to evaluate if the database would be able to cope with concrete cases and could be routinely

Three different positional isomers were possible in that case, with the Cl center in ortho (Figure 3c), meta (Figure 3b), or para (Figure 3a), and prediction was therefore needed to identify the structure. In the HSQC spectrum, three correlations could be peak picked, at 8.02/132.4, 7.58/127.9, and 7.60/131.7 ppm, the latter apparently corresponding to two different protons (its signal intensity was much higher than that of the two other correlations). With this information at hand, para substitution could already be discarded, as only two correlations would be expected for this isomer. Next, the ACD/Labs Predictor module was used to predict the chemical shifts of the three isomers (Figure 4). The closest match with the measured spectrum of sample no. 4 was obtained for the ortho isomer; this molecule is called NEK (Figure 3c). For further validation, the structure was later confirmed by high-field experiments performed at 700 MHz.
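The reasoning applied to sample no. 4 (discard isomers that predict too few correlations, then keep the one whose predicted shifts lie closest to the measurement) can be sketched as below. This is not the ACD/Labs Predictor: the predicted peak lists must be supplied by the user, and the nearest-peak score and the 0.1 weight on the ¹³C dimension are arbitrary illustrative choices.

```python
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (1H shift in ppm, 13C shift in ppm)

# Aromatic HSQC correlations picked for sample no. 4 (values quoted in the text).
MEASURED: List[Peak] = [(8.02, 132.4), (7.58, 127.9), (7.60, 131.7)]


def mismatch(measured: List[Peak], predicted: List[Peak],
             c_weight: float = 0.1) -> float:
    """Mean distance from each measured peak to its nearest predicted peak.

    The 13C difference is down-weighted by c_weight (arbitrary choice) so that
    both dimensions contribute on a comparable scale.
    """
    total = 0.0
    for h, c in measured:
        total += min(abs(h - ph) + c_weight * abs(c - pc)
                     for ph, pc in predicted)
    return total / len(measured)


def rank_isomers(measured: List[Peak],
                 candidates: Dict[str, List[Peak]]) -> List[Tuple[str, float]]:
    """Drop candidates predicting fewer distinct correlations than observed,
    then sort the remaining ones from best to worst agreement."""
    kept = {name: peaks for name, peaks in candidates.items()
            if len(peaks) >= len(measured)}
    return sorted(((name, mismatch(measured, peaks))
                   for name, peaks in kept.items()),
                  key=lambda item: item[1])
```

Calling `rank_isomers(MEASURED, candidates)` with one predicted peak list per isomer returns the candidates ordered by agreement; a para-type candidate with only two distinct aromatic correlations is excluded before scoring.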
The workflow for sample no. 5 started directly with HSQC since no signal was observed in the 1D ¹⁹F spectrum. No match could be found with the identification parameters given in Table 2. As above, we increased LF up to 0.5 ppm for ¹H and 5 ppm for 13

propyl (PVP, MDPV) and hexyl (PV9). However, visually, those spectra strongly differed from that of our unknown structure. In addition, unknown compound no. 5 could not contain only methyl or ethyl groups, since the NMR spectra would then show a doublet or a triplet coupled to a quartet, respectively (see S10 Figure S1).
IR comparison was also made, and the same structures as before were identified as the closest to our structure. Finally, the HSQC prediction was done considering a butyl side chain (3 CH₂ and 1 CH₃), and it returned a map closely resembling that of the unknown sample (Figure S10a). Thus, the structure was identified as MPHP with a butyl side chain, and this was confirmed by 700 MHz experiments (see S10

spectra were compared, it could be seen that additional signals were present in sample no. 6 (see S11). The presence of a mixture explains why LF needed to be increased in order to obtain a match with HSQC.
The two components of this mixture were determined and validated at 700 MHz (see S11): It turns out that the sample is a mixture of NEK and MDMC. This shows that although benchtop NMR failed to identify all the analytes in the mixture, the major component (NEK) could be identified relying on our workflow.
To summarize, six cases were examined using the previously constructed workflow. In the first three cases, AF, AB-FUBINACA, and a mixture of 2C-I and 2C-E were straightforwardly identified with the help of their reference spectra in the database. The next two samples were elucidated with the help of the database to determine the structures of NEK and MPHP (see S10 and S11). Finally, the last sample could not be fully characterized by our workflow, which nevertheless identified NEK as the major component of the mixture. A strategy based on structure similarity was also reported with LC/ESI/HRMS in the field of non-targeted screening. [21] That approach also made it possible to access semiquantitative data, which are of great interest for tracking drugs.

| Quantification
To complete our study, the purity of the fully characterized NPS was measured by quantitative benchtop NMR. Purity is an additional piece of data that helps to trace manufacturers and to further dismantle drug networks. It is generally determined by GC-FID or LC-DAD analysis, but these techniques require the corresponding reference material, which is not systematically accessible and can also be prohibitively expensive. Therefore, NMR appears as a valuable alternative because, on the one hand, quantification can be achieved without physical separation of the sample components, and on the other hand, multiple compounds can be precisely quantified with a single internal or external reference, which does not necessarily have to be structurally close to the analyte (in contrast to chromatography). An internal reference was not used here since overlap with the NMR signals of the samples could not be avoided on the benchtop apparatus. Therefore, the choice was made to determine the purity using TSP as an external reference. Purity determination was carried out relying on the following equation:

where P_TSP is the purity of the external standard TSP, A

Then, the precision of the purity was determined on five successive spectra for each sample, and it was calculated with the following formula:
where CV(A_x) is the standard deviation of the analyte integral determined with five repetitions, CV(A_TSP) is the standard deviation of the reference integral determined with five repetitions, Δm_x and Δm_TSP are equal to 0.01 mg, and m_x and m_TSP have been defined above.
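For orientation, the sketch below implements the textbook qNMR purity relation and a quadrature propagation of the four uncertainty contributions named above. Neither expression is taken from the original equations, which are not reproduced in the text, so both should be read as assumptions; the parameter names (number of protons per signal, molar masses) are likewise illustrative, and the CV values are treated as relative quantities.

```python
import math


def purity(a_x: float, a_ref: float, n_x: int, n_ref: int,
           molar_mass_x: float, molar_mass_ref: float,
           m_x: float, m_ref: float, p_ref: float) -> float:
    """Textbook qNMR purity of analyte x against a reference compound.

    a_*: signal integrals, n_*: protons contributing to each integrated signal,
    molar_mass_*: molar masses, m_*: weighed masses, p_ref: reference purity.
    """
    return ((a_x / a_ref) * (n_ref / n_x)
            * (molar_mass_x / molar_mass_ref) * (m_ref / m_x) * p_ref)


def relative_precision(cv_a_x: float, cv_a_ref: float,
                       m_x: float, m_ref: float,
                       dm_x: float = 0.01, dm_ref: float = 0.01) -> float:
    """Relative uncertainty on the purity, combining integral scatter and
    weighing uncertainty in quadrature (assumed propagation rule).

    Masses and their uncertainties are in mg; CVs are fractions (0.02 = 2%).
    """
    return math.sqrt(cv_a_x ** 2 + cv_a_ref ** 2
                     + (dm_x / m_x) ** 2 + (dm_ref / m_ref) ** 2)
```

With an external reference, the relation additionally presupposes that analyte and reference are measured under identical conditions (same volume, same acquisition and processing parameters).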
Quantification with 1D ¹H benchtop NMR can be difficult, because an accurate (true and precise) quantification requires at least one signal per compound that does not overlap with others and for which the baseline reaches zero on both sides. Thus, the purity of the substance, its nature, and sample/matrix interferences could impact the accuracy of the quantification. For each of the previous samples, the signal which was the least overlapped with others was used (see S12). Calculated purity values are displayed in Table 3. The two pure identified structures have a purity higher than 90% with a precision of approximately 2%. For the identified mixture, the purities of 2C-I and 2C-E were determined as 66% and 21%, respectively, again with a precision close to 2%. MPHP was determined to be more than 90% pure. Finally, the purity of NEK was determined to be much higher than 100%, and this anomalous result is related to the absence of an overlap-free signal in the 1D spectrum. The signals integrated for the determination of purity belonged to the aromatic region and presumably overlapped with signals of a minor component that could not be detected at a ¹H frequency below 100 MHz (some signals could be detected but not identified at

| CONCLUSION
An integrative workflow was created to help identify NPS. It includes benchtop 1D and 2D NMR, IR, and prediction together with an experimental database to characterize NPS according to the recommendations of the SWGDRUG working group. The database includes data for 57 compounds but will be enriched with more samples in the near future. The efficiency of the workflow was evaluated on six seizures.
Most structures were unambiguously identified or elucidated, even in the case of mixtures. This evaluation also highlighted the limitation of the workflow when significant overlap between peaks occurs. In some cases, an expert or trained eye will most likely be needed, but we are confident that the workflow presented in this study will facilitate the identification process.
Additionally, purity was determined for each sample by 1D ¹H NMR. Precise purities were obtained except for one case where the NMR signal heavily overlapped with that of another molecule. To the best of our knowledge, this work stands as the first example of using benchtop NMR in an integrated workflow for identifying, elucidating, and quantifying NPS. This workflow can serve as an alternative to the use of GC-MS: it could allow forensic laboratories to efficiently characterize and quantify a wider panel of drugs and especially NPS.
The present workflow and database work well for simple elucidation cases; for more difficult ones, a bigger database including more reference structures is needed. In addition, the proposed workflow could be combined with the workflows already used by the police, by including GC-MS in the database to maximize its potential. Another appealing perspective would be to combine those developments with machine learning, to create an even more automated and reliable elucidation procedure.